The severe acute respiratory syndrome coronavirus (SARS-CoV) devotes a significant portion of its genome 
to producing nonstructural proteins required for viral replication. SARS-CoV nonstructural protein 9 (nsp9) 
was identified as an essential protein with RNA/DNA-binding activity, and yet its biological function within the 
replication complex remains unknown. Nsp9 forms a dimer through the interaction of parallel α-helices 
containing the protein-protein interaction motif GXXXG. In order to study the role of the nsp9 dimer in viral 
reproduction, residues G100 and G104 at the helix interface were targeted for mutation. Multi-angle light 
scattering measurements indicated that the G100E, G104E, and G104V mutants are monomeric in solution, 
indicating that the mutations disrupt the dimer. However, electrophoretic mobility assays revealed that the mutants bound RNA 
with similar affinity. Further experiments using fluorescence anisotropy showed a 10-fold reduction in RNA 
binding in the G100E and G104E mutants, whereas the G104V mutant had only a 4-fold reduction. The 
structure of G104E nsp9 was determined to 2.6-Å resolution, revealing significant changes at the dimer 
interface. The nsp9 mutations were introduced into SARS-CoV using a reverse genetics approach, and the 
G100E and G104E mutations were found to be lethal to the virus. The G104V mutant produced highly 
debilitated virus and eventually reverted to the wild-type protein sequence through a codon transversion. 
Together, these data indicate that dimerization of SARS-CoV nsp9 at the GXXXG motif is not critical for RNA 
binding but is necessary for viral replication. 


The discovery of a novel coronavirus (CoV) as the causative 
agent of severe acute respiratory syndrome (SARS; SARS- 
CoV) has highlighted the need for a better understanding of 
CoV replication (19). After emerging in late 2002 from the 
Guangdong Province in China, SARS-CoV was rapidly iso- 
lated, and its genome sequenced to reveal a new CoV that was 
phylogenetically distinct, suggesting a new classification as a 
type IIb CoV (15, 32, 43, 50). Genomic comparison to the 
closely related murine hepatitis virus (MHV), human CoV 
OC43 (HCoV-OC43), and bovine CoV revealed a highly con- 
served genomic structure with many regions nearly identical 
(52). After strict quarantine controls were initiated, SARS-CoV
was contained, with approximately 8,000 individuals
clinically infected and close to 800 deaths
(www.who.int/csr/sars/en). Recently, the natural reservoir for SARS-CoV
was reported to be the Chinese horseshoe bat, indicating that (i)
the disease is still circulating in animals and (ii) a future reemergence
from a zoonotic source is possible, underscoring
the importance of continued study of this virus (36, 38). 

CoVs devote a significant portion of their positive-sense 
single-stranded RNA (ssRNA) genome to proteins related to 
viral RNA replication. SARS-CoV has a genome of ~29,700 
nucleotides, of which more than 21,000 code for the 16 non- 
structural proteins (nsp’s) (43, 50). The replicase genes open 
reading frame 1a (ORF1a) and ORF1b are translated into 
large polyproteins: pp1a (4,300 amino acids [aa]) and, through 
a −1 ribosomal frameshift mechanism, a fusion protein known 
as pp1ab (7,000 aa) (58). Posttranslational processing of the 
polyproteins by two distinct viral proteinases, the papain-like 
proteinase and a 3C-like proteinase (3CLpro; also known as 
Mpro), yields 16 mature nsp’s, many of which interact to form 
the replication complex responsible for the synthesis of nega- 
tive-strand template, positive-strand genomic, and all sub- 
genomic RNAs (sgRNA) with additional roles related to 
cellular processes (23, 30, 37, 52, 54, 68). The precise stoichiometry
of the replicase complex is unknown, but yeast two-hybrid
screens, glutathione S-transferase pull-down assays, and 
X-ray crystallography have revealed a number of protein-protein
interactions between the various nsp’s (27, 60, 67). 

In general, proteins form multimers for a variety of reasons 
including stability, allostery, and the need to ensure accurate translation 
of genetic information (22). Translation of large polypeptides 
can result in errors in protein sequence and therefore large 
complexes are usually assembled from multiple small proteins. 
Viruses both violate and adhere to this rule by first producing 
large polyproteins and then processing them into individual 
proteins that assemble to form active replication complexes. 
For the CoVs, the protein components and stoichiometry of 
the positive- and negative-strand replication complexes remain 
unknown. Crystal structures of individual SARS-CoV nsp’s 
have revealed several different multimeric states for the same 
protein. For example, the structure of nsp10 is reported as a 
dimer and as a dodecamer by separate groups (29, 55), al- 
though genetic data suggest that the dodecamer structure is 
not essential for in vitro replication (14). Biochemical studies 
of nsp7 and nsp8 indicate that, independently, they are dimeric 
in solution but the crystal structure of the nsp7 and nsp8 
complex is a hexadecamer (67). For nsp9, there are four crys- 
tallographic structures which report a variety of dimeric inter- 
faces (16, 49, 57). In one case, two interfaces are present in a 
single crystal form, suggesting that a tetrameric complex that 
incorporates both interfaces may be possible. Although the 
information provided by these structures is significant, the bi- 
ological relevance to the replication complex has not been 
established. 

SARS-CoV nsp9 has been shown to have RNA and DNA- 
binding ability through a variety of methods (16, 49, 57). nsp9 
from MHV-A59 has been shown through immunofluorescence 
studies to localize with, among others, the helicase (nsp13), 
nucleocapsid (N protein), and 3CLpro (nsp5) (4, 5). nsp9 also 
localizes to late endosomes at sites of replication with nsp7, 
nsp8, and nsp10 and is likely a member of the replication 
complex (4). An nsp9 knockout in MHV-A59 is not viable, 
while an nsp9-10 fusion polyprotein is viable but attenuated 
in growth, suggesting that the mature form of nsp9 plays a 
critical role in viral replication (13). 

Several crystallographic structures of nsp9 have shown that it 
is composed of seven beta strands and a single alpha helix (16, 
49, 57). The fold of nsp9 is reported to be similar to that of 
domains I and II of the 3CLpro encoded within the SARS-CoV 
genome (16); however, no significant sequence similarity exists 
between the two. The presumed biological dimer utilizes the 
interaction between the lone helices of each monomer to form 
the parallel helix-helix dimer (Fig. 1A). A second dimeric form 
has a beta-sheet interface stabilized by main chain atom inter- 
actions within the sheet regions of each monomer (Fig. 1B). A 
recent crystal structure of HCoV-229E nsp9 reveals an antiparallel
helix-helix dimer formed by a disulfide bond at Cys69 
(Fig. 1C). A multiple sequence alignment 
(Fig. 1D) of various CoV nsp9 proteins showed that while 
the helix-helix interface in Fig. 1A contains multiple 
conserved residues along the dimer interface and buries 
~1,000 Å² of surface-exposed area, the sheet-sheet dimer in 
Fig. 1B contains no conserved residues in the dimer region 
and buries only ~500 Å². A survey of dimer interfaces found 
in protein structures suggests that for a protein of ~15 kDa a 
buried surface area of ~1,000 Å² would be expected (2, 28). 

The common protein-protein interaction motif GXXXG 
(20, 31) is conserved at the dimer interface, allowing the heli- 
ces to closely pack at the positions of G100 and G104 (Fig. 
1A). Therefore, we designed mutagenesis experiments to in- 
vestigate the stability and function of that dimer. Mutants 
G100E, G104E, and G104V were created to disrupt the di- 
meric interface and were characterized by size exclusion chro- 
matography (SEC), multi-angle light scattering (MALS), and 
circular dichroism (CD) spectroscopy. Effects on ssRNA bind- 
ing were assessed by RNA electrophoretic mobility shift assay 
(EMSA) and fluorescence anisotropy (FA). The crystal structure
of G104E was solved and refined to 2.6-Å resolution, and 
changes in the helix-helix interface were observed. 
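
The motif search behind this design can be illustrated with a short script. The sketch below scans the C-terminal SARS-CoV nsp9 fragment shown in Fig. 1D for GXXXG motifs. The function name and the numbering offset (the fragment is assumed to begin at residue 95, inferred from G100 being its sixth residue) are illustrative assumptions and not part of the original analysis.

```python
import re

# C-terminal SARS-CoV nsp9 fragment as printed in Fig. 1D.
SARS_NSP9_CTERM = "NNLNRGMVLGSLAATVRLQ"

# Assumption: the fragment starts at residue 95, so its sixth residue is G100.
FRAGMENT_START = 95


def find_gxxxg(sequence, first_residue=1):
    """Return (G_i, G_i+4) residue-number pairs for every GXXXG motif.

    A zero-width lookahead is used so that overlapping motifs
    (for example GXXXGXXXG) are all reported.
    """
    pairs = []
    for match in re.finditer(r"(?=G...G)", sequence):
        start = match.start() + first_residue
        pairs.append((start, start + 4))
    return pairs


if __name__ == "__main__":
    # Reports the glycine pair that forms the helix-helix contact.
    print(find_gxxxg(SARS_NSP9_CTERM, FRAGMENT_START))
```

Under the stated numbering assumption, the only motif found in this fragment is the G100/G104 pair targeted for mutagenesis.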

The development of reverse genetics approaches to studying 
CoVs has also allowed us to reintroduce the G100E, G104E, 
and G104V mutations into SARS-CoV (Urbani strain) to 
study their effects on the virus in vivo (65). The G100E and 
G104E mutations were lethal to the virus, while the G104V 
mutation produced a highly debilitated growth phenotype with 
eventual transversion of the codon at position 104 from GTG 
(Val) to GGG (Gly), indicating that dimerization of nsp9 at 
the GXXXG interface is required for efficient viral growth. 

SARS        NNLNRGMVLGSLAATVRLQ 
IBV         RSIVRGMVLGAISNVVVLQ 
TGEV        NTLRRGAVLGYTGATVRLQ 
HCoV 229E   NNLRRGAVLGYIGATVRLQ 
MHV-A59     NTLARGWVVCTLSSTVRLQ 

FIG. 1. Dimer arrangements in nsp9. (A) nsp9 crystal structure 
(1QZ8) showing presumed biological dimer and helix-helix interface. 
The individual monomers are colored in blue/green and red/yellow, 
respectively. The positions of G100 and G104 are depicted as space-filling
models for both monomers. G100E and G104E are labeled in 
one monomer. (B) Alternate nsp9 dimer (1UW7) stabilized through 
sheet regions. Each monomer is colored from the N terminus (blue) to 
the C terminus (red). (C) Antiparallel helix-helix dimer of HCoV-229E
nsp9 stabilized by a disulfide linkage at C69. Each monomer is 
colored from the N terminus (blue) to the C terminus (red). (D) Multiple
sequence alignment of CoV nsp9 homologs showing absolute 
conservation of glycines equivalent to G100 (green) and G104 (cyan) 
in SARS-CoV. Images were prepared by using PyMol. 


MATERIALS AND METHODS 


Mutant generation. SARS-CoV nsp9-pET23d(+) plasmid containing the 
coding sequence of nsp9 in addition to a C-terminal His6 tag was a gift from 
Mark Denison (Vanderbilt University). Nsp9 G100E, G104E, and G104V 
mutants were generated by using a QuikChange site-directed mutagenesis kit 
(Qiagen) according to the manufacturer’s protocol using the following primers:
9G100EF, 5'-AACAACCTAAATAGAGAAATGGTGCTGGGCAGTTTAGC-3'; 
9G100ER, 5'-GCTAAACTGCCCAGCACCATTTCTCTATTTAGGTTG-3'; 
9G104EF, 5'-GAGGTATGGTGCTGGAAAGTTTAGCTGCTAC-3'; 
9G104ER, 5'-GTAGCAGCTAAACTTTCCAGCACCATACCTC-3'; 
9G104VF, 5'-GAGGTATGGTGCTGGTGAGTTTAGCTGCTAC-3'; and 
9G104VR, 5'-GTAGCAGCTAAACTCACCAGCACCATACCTC-3' (bold 
facing indicates sites of mutation). Mutagenesis products were transformed 
into DH5α cells (Invitrogen, Carlsbad, CA) and plated on LB agar plates 
(supplemented with 100 µg of ampicillin/ml) from which colonies were selected
and grown in 5-ml cultures of LB supplemented with 100 µg of 
ampicillin/ml. Plasmid was purified by using a Qiagen spin miniprep kit 
according to the manufacturer’s protocol. The presence of the mutations 
within the plasmids was confirmed through DNA sequencing on an ABI Prism 
3130XL genetic analyzer (Roswell Park Cancer Institute Biopolymer Facility). 
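
Because each mutagenic primer pair must anneal to opposite strands of the same site, the listed sequences can be checked for complementarity. The following is a minimal sketch of such a check, using the primer sequences given above; the function names are illustrative, and the check is not described in the original protocol.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Mutagenic primer pairs as listed in the text (all written 5'->3').
PRIMERS = {
    "G100E": ("AACAACCTAAATAGAGAAATGGTGCTGGGCAGTTTAGC",
              "GCTAAACTGCCCAGCACCATTTCTCTATTTAGGTTG"),
    "G104E": ("GAGGTATGGTGCTGGAAAGTTTAGCTGCTAC",
              "GTAGCAGCTAAACTTTCCAGCACCATACCTC"),
    "G104V": ("GAGGTATGGTGCTGGTGAGTTTAGCTGCTAC",
              "GTAGCAGCTAAACTCACCAGCACCATACCTC"),
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primers_overlap(forward, reverse):
    """True if the reverse primer is fully complementary to part of the forward primer."""
    return reverse_complement(reverse) in forward


if __name__ == "__main__":
    for name, (fwd, rev) in PRIMERS.items():
        print(name, primers_overlap(fwd, rev))
```

A substring test is used rather than strict equality because the G100E reverse primer, as printed, is two nucleotides shorter than its forward partner.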

Expression and purification. Plasmids were each transformed into 
BL21(DE3) (Novagen) cells and screened for expression conditions. Soluble 
protein was obtained through the addition of IPTG (isopropyl-β-D-thiogalactopyranoside)
to 1 mM in 1 liter of culture grown in LB (supplemented with 100 
µg of ampicillin/ml and grown to an optical density at 600 nm of ~0.6 at 37°C), 
followed by shaking at 37°C for 3 h. Cells were collected by centrifugation and 
stored at −80°C. 

All purification steps were carried out at 4°C. Cell pellets from 1-liter cultures 
were thawed, resuspended in 15 ml of buffer A (50 mM Tris [pH 8.0], 300 mM 
NaCl, 10 mM imidazole) and 1 ml of protease inhibitor cocktail (Roche), and 
then lysed by using a single pass through a Microfluidizer (Microfluidics Co.) at 
~18,000 lb/in². The resulting lysate was centrifuged at 35,000 rpm at 4°C for 30 
min in a 45 Ti rotor (Beckman). Supernatant was filtered (0.22-µm pore size) 
and applied to a 5-ml HisTrap (GE Healthcare) column pre-equilibrated in 
buffer A. Protein was washed with buffer A followed by 12% buffer B (50 mM 
Tris [pH 8.0], 300 mM NaCl, 300 mM imidazole) and eluted from the column in 
a linear gradient from 12 to 100% buffer B. Fractions were analyzed by sodium 
dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), and nsp9-containing
fractions were pooled and dialyzed against 1 liter of buffer C (10 mM Tris 
[pH 8], 150 mM NaCl, 1 mM EDTA, 5 mM dithiothreitol). Concentrated protein 
was subsequently applied to a Superdex 75 HL 16/60 column pre-equilibrated in 
buffer C. Fractions were analyzed by SDS-PAGE with appropriate fractions 
pooled and dialyzed against either 1 liter of buffer D (50 mM MES [morpholineethanesulfonic
acid; pH 5.6], 15 mM NaCl, 0.5 mM TCEP [tris(2-carboxyethyl)phosphine
HCl]) or 1 liter of buffer C. Final samples were stored at 4°C and 
assayed for protein concentration at 280 nm using an extinction coefficient of 
13,075 M⁻¹ cm⁻¹ calculated by ProtParam (63) using the coding sequence of the 
mature nsp9 product with a C-terminal tag (LEHHHHHH). 
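
The conversion from absorbance to concentration follows the Beer-Lambert law, A = εcl. A minimal sketch is given below, using the extinction coefficient stated above and the expected 13.4-kDa mass of the tagged monomer; the function name and the absorbance reading in the example are hypothetical.

```python
EXTINCTION_COEFF = 13075.0  # M^-1 cm^-1, value reported in the text
MONOMER_MASS_DA = 13400.0   # Da, expected mass of the His-tagged nsp9 monomer


def concentration_from_a280(a280, path_cm=1.0, dilution=1.0):
    """Return (molar, mg_per_ml) protein concentration from an A280 reading.

    Beer-Lambert: A = epsilon * c * l, so c = A / (epsilon * l).
    `dilution` is the fold dilution applied before the measurement.
    """
    molar = a280 * dilution / (EXTINCTION_COEFF * path_cm)
    mg_per_ml = molar * MONOMER_MASS_DA  # mol/liter * g/mol = g/liter = mg/ml
    return molar, mg_per_ml


if __name__ == "__main__":
    # Hypothetical reading of 0.5 absorbance units in a 1-cm cuvette.
    molar, mg_per_ml = concentration_from_a280(0.5)
    print(f"{molar * 1e6:.1f} uM, {mg_per_ml:.2f} mg/ml")
```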

SEC and SEC-MALS. Aliquots of elution fractions from immobilized metal 
affinity chromatography (IMAC) purification were applied to a Superdex 75 HR 
10/30 column (GE Healthcare) preequilibrated in buffer E (10 mM NaPO4 [pH 
8], 100 mM NaCl). The Superdex column was attached to an ÄKTA Purifier 
coupled with a Wyatt Systems MiniDAWN (three detectors) and Optilab DSP 
(Wyatt Systems) immediately downstream of the column. Protein samples were 
applied and eluted from the column at 0.5 ml/min with simultaneous collection 
of light scattering and refractive index data, using a laser excitation wavelength 
of 690 nm and light scattering detectors at 45°, 90°, and 135°. Peak analysis was 
performed by using the ASTRA software provided with the system. 

CD spectroscopy. Protein samples were passed over a Superdex 75 HL 16/60 
column in buffer F (25 mM NaPO4 [pH 8], 150 mM NaCl) to remove any 
nonspecific aggregates. Appropriate fractions were pooled and dialyzed against 
1 liter of buffer G (10 mM NaPO4 [pH 8.0]). CD spectra of nsp9 samples were 
gathered from 200 to 255 nm in a 1-mm-pathlength quartz cuvette at 20°C (1-nm 
steps, 50-nm/s scan speed, 4-s response time, three scans) using a Jasco J-715 
Spectrapolarimeter (Department of Pharmaceutical Sciences, University at Buf- 
falo). Secondary structure assignment was achieved through use of the K2d web 
server (www.embl-heidelberg.de/~andrade/k2d) and the DICHROWEB server 
(www.cryst.bbk.ac.uk/cdweb/html/home.html) (1, 40, 62). 

For thermal denaturation experiments using CD spectroscopy, a similar pro- 
tocol was followed. Samples were monitored while heating from 20 to 95°C using 
a Peltier device. Spectra were gathered at 5°C intervals, with peak monitoring at 
205 nm. The data were analyzed using Origin 7.0. 

RNA EMSA. A labeled ssRNA oligonucleotide (Integrated DNA Technologies)
was used for gel shift assays (Biotin-5'-CGACUCAUGGACCUUGGCAG-3').
The oligonucleotide was resuspended in RNase-free water at 1 µM and stored 
at −20°C. Then, 5 µg of each nsp9 sample (in buffer D) was mixed with 1 µM RNA, 1 µM 
ssDNA oligonucleotide (3CL NdeI Forward2 [5'-GGTGGTCATATGAGTGGTTTTAGGAAAATGGCATT-3']),
and RNase inhibitor (SUPERaseIN; Ambion),
followed by incubation for 60 min at room temperature. After incubation, 
the samples were cross-linked at 254 nm for 15 min. High-density Tris-borate-EDTA
(TBE) loading buffer (Invitrogen) was added to the nucleic acid-protein 
mixtures, which were loaded onto Novex 4 to 20% TBE polyacrylamide gels 
(Invitrogen) and resolved at 80 V for 2 h in 0.5× TBE buffer. The gels were then 
transferred to positively charged nylon membrane (Hybond) in 0.5× TBE at 25 
V for 30 min in a semidry transfer apparatus (Bio-Rad). Membranes were cross-linked
for 5 min in 2× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate) 
at 254 nm. The membranes were then developed using a BrightStar biodetection 
kit (Ambion) according to the manufacturer’s protocol. Extended wash steps 
were incorporated to help further decrease the nonspecific background of the 
membrane. 

Attempts were also made to compete the RNA probe off of nsp9 using an 
ssDNA template which was added at equivalent and 10-fold-higher levels com- 
pared to the RNA probe. Developed membrane was exposed to Hyperfilm ECL 
(GE Healthcare) and analyzed for electrophoretic mobility. 

FA. FA measurements were performed on a Fluoromax-4 spectrophotometer 
(Jobin Yvon Horiba) equipped with a temperature-controlled cell and polarizing 
filters. All experiments were carried out at 22°C. 5'-Fluorescein-labeled RNA
(5'-FAM-CGACUCAUGGACCUUGGCAG-3'; IDT) was used in all experiments. 
FAM-RNA was dissolved in buffer (20 mM Tris [pH 7.2]) to a final concentration 
of 58 nM in a 1-ml quartz cuvette. Small aliquots (1 to 2 µl) of nsp9 or nsp9 
mutants (in 10 mM MES [pH 5.6]) were added to the cuvette covering a protein 
concentration range of 1 to 2,000 nM, followed by incubation with stirring for 5 
min between measurements. FA was measured by exciting at 490 nm and mon- 
itoring the emission at 515 nm. Slits for the excitation and emission were set to 
2.5 nm. The integration time was 2.0 s for the anisotropy measurements. FA data 
collection was controlled by using the FA module of the FluorEssence software 
(version 2.1.5.0). Each anisotropy value is the average of 10 individual anisotropy 
measurements. The relative standard deviation was <2% for all measurements. 
The data were analyzed using nonlinear regression fitting of the data to the 
following equation (42) for a single site binding model using Prism software: 


$$A = A_f + (A_b - A_f)\,\frac{1 + K_a[L_T] + K_a[R_T] - \sqrt{\left(1 + K_a[L_T] + K_a[R_T]\right)^2 - 4[L_T][R_T]K_a^2}}{2K_a[R_T]}$$
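
The single-site model used for the FA fits expresses the measured anisotropy as a weighted average of the free and bound RNA signals, with the bound fraction obtained from the quadratic solution of the 1:1 binding equilibrium. A minimal Python sketch of that model follows. The variable names, the example association constant, and the limiting anisotropy values are illustrative assumptions; the original fits were performed in Prism, and no fitting routine is shown here.

```python
import math


def fraction_bound(protein_total, rna_total, k_a):
    """Fraction of labeled RNA in complex for 1:1 binding.

    Concentrations are in M and k_a (association constant) in M^-1.
    This is the quadratic solution of K_a = [C] / ([L_T]-[C])([R_T]-[C]).
    """
    if protein_total == 0:
        return 0.0
    b = 1.0 + k_a * protein_total + k_a * rna_total
    disc = b * b - 4.0 * protein_total * rna_total * k_a * k_a
    return (b - math.sqrt(disc)) / (2.0 * k_a * rna_total)


def anisotropy(protein_total, rna_total, k_a, a_free, a_bound):
    """Observed anisotropy as a weighted average of free and bound RNA."""
    f = fraction_bound(protein_total, rna_total, k_a)
    return a_free + (a_bound - a_free) * f


if __name__ == "__main__":
    RNA = 58e-9      # M, FAM-RNA concentration stated in the text
    K_A = 1e7        # M^-1, hypothetical value for illustration only
    A_FREE, A_BOUND = 0.05, 0.20  # hypothetical limiting anisotropies
    for protein_nm in (1, 10, 100, 1000, 2000):  # range covered in the titration
        a = anisotropy(protein_nm * 1e-9, RNA, K_A, A_FREE, A_BOUND)
        print(f"{protein_nm:>5} nM protein -> anisotropy {a:.4f}")
```

In practice, the association constant and the two limiting anisotropies would be the free parameters of a nonlinear least-squares fit to the titration data.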


ITC. Dilution isothermal titration calorimetry (ITC) is a technique amenable 
to determining protein dimerization constants (7, 9). Purified protein samples 
were buffer exchanged five times in an Amicon centrifugal concentrator
into the appropriate buffer (20 mM MES [pH 5.6], 15 mM 
NaCl, and 0.1 mM TCEP for wild type, G100E, and G104E; 20 mM Na TAPS 
{N-[Tris(hydroxymethyl)methyl]-3-aminopropanesulfonic acid} [pH 8.5], 100 
mM MgCl2, and 0.1 mM TCEP for G104V). Protein samples and corresponding 
buffer were filtered to 0.2 µm (Acrodisc syringe filter, low protein binding HT 
Tuffryn membrane; PALL Life Sciences) and thoroughly degassed. 

Buffer corresponding to the sample to be studied was loaded into the reference 
and sample cells of the VP-ITC (Microcal). Typically, 28 injections of 10 µl of 
protein were made into the sample cell filled with buffer. Concentrations of nsp9 
in the injection syringe ranged from 189 µM to 6.3 mM, and concentration 
ranges in the sample cell were monitored from ~0.001 mM to ~0.3 mM (near 
the published dimerization Kd). Additional measurements on samples were made 
up to a final concentration of ~1 mM within the sample cell. Samples which 
showed a saturation curve were fit to a simple dissociation model using the 
included analysis software (Origin 7) specifically designed for dilution ITC data 
analysis. 
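
Dilution ITC works because a dimer injected into buffer dissociates to an extent set by the final concentration relative to the dimerization Kd. The sketch below shows the underlying monomer-dimer equilibrium; it is an illustration of the general model, not the Origin fitting routine used in this study, and the Kd value in the example is a placeholder chosen to match the upper end of the monitored concentration range.

```python
import math


def monomer_fraction(total_protein, k_d):
    """Fraction of protein chains present as monomer for 2M <-> D.

    With K_d = [M]^2 / [D] and mass balance P_T = [M] + 2[D], the monomer
    concentration is [M] = K_d * (sqrt(1 + 8 * P_T / K_d) - 1) / 4.
    Concentrations and k_d are in the same units (e.g., M).
    """
    if total_protein == 0:
        return 1.0
    monomer = k_d * (math.sqrt(1.0 + 8.0 * total_protein / k_d) - 1.0) / 4.0
    return monomer / total_protein


if __name__ == "__main__":
    K_D = 3e-4  # M, placeholder value for illustration only
    for conc in (1e-6, 1e-5, 1e-4, 3e-4, 1e-3):
        print(f"{conc * 1e3:7.3f} mM total -> {monomer_fraction(conc, K_D):.2f} monomer")
```

At a total concentration equal to Kd, half of the chains are monomeric, which is why the sample cell concentrations were chosen to span that region.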

Mutant design for reverse genetics. To determine the effects of mutations 
G100E, G104E, and G104V on SARS-CoV replication, the three mutations were 
individually engineered into the molecular clone of SARS-CoV, replacing the 
wild-type codons as follows: G100E, GGT to GAG; G104V, GGC to GTG; and 
G104E, GGC to GAA. This was accomplished by using the “No See’m” (66) 
approach whereby primers were designed utilizing type IIS restriction endonu- 
clease sites which allowed sticky ends to be generated with non-native nucleo- 
tides present. This was used to introduce the mutations into the codons described 
above, using the wild-type SARS-D fragment from the infectious clone of SARS- 
CoV strain Urbani (icSARS) as the backbone. The primers used were as follows: 
G100E mutant, G100EF (5'-AGAGAGATGGTGCTGGGCAGT-3') and 
G100ER (5'-ACTGCCCAGCACCATCTCTCT-3'); G104E mutant, G104EF 
(5'-AGAGGTATGGTGCTGGAAAGT-3') and G104ER (5'-ACTTTCCAGCACCATACCTCT-3');
and G104V mutant, G104VF (5'-AGAGGTATGGTGCTGGTGAGT-3')
and G104VR (5'-ACTCACCAGCACCATACCTCT-3'). The 
primers used to amplify the nsp9 cDNA were M13R (5'-AGGAAACAGCTATGAC-3')
and nsp9R (5'-AAGTTCCTTGAAACTGAGACG-3'). Each mutation 
was engineered in this way into the SARS-D fragment and the full-length mu- 
tated D fragments were digested, purified, and ligated back into the original 
vector. Transformation was conducted by heat shock at 42°C for 2 min, and 
transformed E. coli was plated on LB plates with appropriate selection. Plates 
were grown at room temperature for 48 h, after which five colonies were picked 
and amplified in 5 ml of LB broth incubated at room temperature, under 
appropriate selection, and with agitation. After 24 h, the plasmids were purified 
and digested with restriction endonuclease BglI to determine whether the DNA 
resembled wild type by restriction screening. Plasmids that digested correctly 
were amplified in 20 ml of LB broth with the appropriate antibiotic and grown 
for 24 h at room temperature with agitation. The plasmids were again purified, 
and three of each mutant were sequenced to verify that the mutation was 
incorporated into the wild-type SARS-D fragment. 
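
The engineered codon changes, and the later GTG-to-GGG reversion at position 104, can be checked against the standard genetic code and classified as transitions or transversions. The following sketch does this for the codons given in the text; the helper names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}

# Codon changes stated in the text (wild type or parent -> mutant).
CODON_CHANGES = {
    "G100E": ("GGT", "GAG"),
    "G104V": ("GGC", "GTG"),
    "G104E": ("GGC", "GAA"),
    "V104G reversion": ("GTG", "GGG"),
}


def classify_change(old, new):
    """List (codon position, kind) for each base that differs between two codons.

    A transition stays within purines or within pyrimidines;
    a transversion swaps a purine for a pyrimidine or vice versa.
    """
    changes = []
    for pos, (x, y) in enumerate(zip(old, new), start=1):
        if x != y:
            same_class = (x in PURINES) == (y in PURINES)
            changes.append((pos, "transition" if same_class else "transversion"))
    return changes


if __name__ == "__main__":
    for name, (old, new) in CODON_CHANGES.items():
        print(name, CODON_TABLE[old], "->", CODON_TABLE[new], classify_change(old, new))
```

Each engineered codon differs from the wild type at two positions, whereas the reversion requires only a single base change at the second codon position.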

Assembling the mutant viruses. The infectious clone of the Urbani strain of 
SARS-CoV was used as the backbone for this project, and mutated SARS-D 
fragments containing each of the mutations of interest were assembled into the 
full-length infectious cDNA as described previously (64, 66). Briefly, icSARS 
fragments A through F were amplified in E. coli (TOP-10; Invitrogen), purified, 
and screened by restriction digestion. Large stocks of each fragment that 
screened correctly were established and digested with the appropriate enzyme as 
follows. icSARS fragments B, C, D and E were digested with BglI. icSARS 
fragments A and F were digested with EcoRI and NotI, respectively. icSARS 
fragments A and F were then dephosphorylated and digested with BglI. 

The digested bands were subjected to electrophoresis, excised, and gel purified 
(Qiagen), and ligation reactions were set up using equivalent molar ratios of each 
fragment and ligase (Roche). Wild-type fragments were ligated to generate a 
full-length cDNA of wild-type SARS-CoV, and the SARS-D fragment bearing 
each mutation was used to generate full-length cDNAs for each mutant. The 
ligation reactions were purified by chloroform extraction and isopropanol pre- 
cipitation, and nucleocapsid cDNA and full-length viral genomic cDNA were 
used as templates for in vitro transcription reactions (Ambion). Nucleocapsid 
and full-length viral genomic transcripts were then electroporated into Vero 
cells. Transfections were monitored for 3 days and passaged if necessary. All 
recombinant icSARS strains were propagated on Vero E6 cells in Eagle minimal 
essential medium (Invitrogen) supplemented with 10% fetal calf serum (HyClone,
Logan, UT), kanamycin (0.25 µg/ml), and gentamicin (0.05 µg/ml) at 
37°C in a humidified CO2 incubator. All work was performed in a biological 
safety cabinet in a biosafety level 3 laboratory containing redundant exhaust fans. 
Personnel were equipped with powered air-purifying respirators with high-effi- 
ciency particulate air and organic vapor filters (3M, St. Paul, MN), wore Tyvek 
suits (DuPont, Research Triangle Park, NC), and were double gloved. 

Verification of viral replication. The cells were analyzed daily for cytopathic 
effect (CPE), and viral subgenomic transcription was verified by reverse transcription-PCR
(RT-PCR) of leader-containing transcripts of the S gene. Briefly, 
primer pairs were designed to amplify the leader RNA sequence and an ~300-nucleotide
region extending into the S gene. Total RNA was harvested from transfected 
cells by using TRIzol reagent, and purified RNA was reverse transcribed using 
random hexamer primers and Superscript III reverse transcriptase, according to 
the manufacturer’s protocol. S gene-specific, leader-containing cDNA was then 
amplified by PCR. 

Analysis of mutant G104V RNA replication. The SARS-CoV infectious clone 
bearing the G104V mutation was assembled in triplicate as described above and 
transfected via three independent cuvettes containing 8 × 10⁶ Vero cells. The 
transfected cells from each cuvette were then divided into three T25 flasks, and 
medium was added; therefore, three independent transfections were each di- 
vided into three flasks for a total of nine flasks. One flask from each transfection 
was harvested at 12, 24, and 36 h posttransfection. As a control, icSARS was also 
transfected into cells and harvested in the same way. In each case, transfected 
cells were isolated in TRIzol reagent (Invitrogen) using the manufacturer’s 
protocol for isolation of total RNA by isopropanol precipitation and eluted to 50 
µl in nuclease-free water. All samples were then incubated for 1 h at 37°C with 
DNase (Applied Biosystems, Foster City, CA) according to manufacturer’s pro- 
tocol to eliminate any residual input cDNA. 

Total RNA from each time point and transfection was reverse transcribed to 
cDNA using SuperScript II (Invitrogen) with modifications to the protocol as 
follows. Random hexamers (300 ng) and total RNA (5 µl) were incubated for 10 
min at 70°C. The remaining reagents were then added according to the manu- 
facturer’s recommendations, and the reaction was incubated at 55°C for 1 h, 
followed by 20 min at 70°C to deactivate the reverse transcriptase. Quantitative 
real-time PCR was conducted using SmartCycler II (Cepheid, Sunnyvale, CA) 
with SYBR green (diluted to 0.25×; Cepheid) to detect subgenomic cDNA with 
primers (7.5 pmol) optimized to detect from the leader sequence to the 5' end 
of the N gene (forward N1S, AAAGCCAACCAACCTCGATC; reverse N1A, 
GCGTCCTCCATTCTGGTTAT) or genomic cDNA with primers (7.5 pmol) 
designed to detect genomic ORF1a cDNA (forward G104Vf, AGAGGTATGGTGCTGGTGAGT;
reverse nsp9r, AAGTTCCTTGAAACTGAGACG). The 
cDNA from the RT reaction mixtures of each virus was used, at a volume of 5 µl 
for each reaction, with a total reaction mixture volume of 25 µl. Omnimix beads 
(Cepheid) containing all reagents except SYBR green, primer, and template 
were used to standardize the reaction conditions, and template concentrations 
were normalized by concentrations of the housekeeping gene GAPDH (glyceraldehyde-3-phosphate
dehydrogenase; forward GAPDHF, CATGGGGAAGGTGAAGGTCG;
reverse GAPDHR, TTGATGGTACATGACAAGGTGC). In 
addition, all products were verified by melting curve analysis. 
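
The text does not specify how the GAPDH normalization was computed. One common approach, shown here purely as an assumption, is to express the viral signal relative to GAPDH from the difference in threshold cycles (Ct), assuming that the template doubles in every cycle. The function name and the Ct values below are hypothetical.

```python
def relative_abundance(ct_target, ct_reference, efficiency=2.0):
    """Target abundance relative to the reference gene.

    Assumes each PCR cycle multiplies the template by `efficiency`
    (2.0 means perfect doubling), so that
    relative abundance = efficiency ** (ct_reference - ct_target).
    """
    return efficiency ** (ct_reference - ct_target)


if __name__ == "__main__":
    # Hypothetical Ct values for viral RNA and GAPDH at three time points.
    samples = {"12 h": (30.0, 18.0), "24 h": (26.0, 18.0), "36 h": (22.0, 18.0)}
    for label, (ct_viral, ct_gapdh) in samples.items():
        print(label, relative_abundance(ct_viral, ct_gapdh))
```

A lower Ct for the viral target at a fixed GAPDH Ct corresponds to a proportionally higher normalized RNA level.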

Crystallization, cryopreservation, and diffraction. Purified nsp9 G104E was 
crystallized by the hanging-drop vapor diffusion method utilizing a Hangman 
system (41). Protein was concentrated to 5.8 mg/ml and mixed in a 1:1 ratio with 
crystallization cocktail (1.7 to 1.8 M ammonium sulfate, 0.1 M citrate phosphate 
buffer [pH 4 to 4.3]). Hexagonal rods generally appeared within 1 week and grew 
to a maximum of 0.3 by 0.1 by 0.1 mm. 

Crystals were transferred for 30 min to 1 h into cryobuffer (2 M ammonium 
sulfate, 0.1 M citrate phosphate buffer [pH 4.0 to 4.2], 25% [vol/vol] glycerol). 
Cryoprotected crystals were mounted onto CryoLoops (Hampton Research), 
quickly plunged into liquid nitrogen, and stored in a cryodewar. 

Diffraction experiments were remotely conducted on beamline 11-1 of the 
Stanford Synchrotron Radiation Laboratory (SSRL) using the Blue-Ice interface 
and Web-Ice analysis software (21, 44). The data were collected at a wavelength 
of 0.979 Å using a 20-s exposure and 1° oscillation per frame on a MAR-325 
charge-coupled device detector. The data were indexed, integrated, merged, and 
scaled to 2.6-Å resolution, as implemented within the HKL2000 package (47). 

Phasing and refinement. Processed data were input into the CaspR (10) server for 
molecular replacement trials using a variety of search models, including the 
previously solved nsp9 structures (Protein Data Bank codes 1QZ8 and 1UW7) 
(16, 57). The best solution was found using monomeric 1QZ8 (16) as a search 
model and resulted in a model with four copies of nsp9 within the asymmetric 
unit. 
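
The plausibility of four copies per asymmetric unit can be checked with the Matthews coefficient, V_M = V_cell / (Z × M), and the derived solvent fraction, 1 − 1.23/V_M. The sketch below uses the unit cell dimensions from Table 1 and the expected 13.4-kDa monomer mass. It assumes three symmetry-related asymmetric units per cell, as for space group P3(1), and the function names are illustrative.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (cubic angstroms) of a cell with a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(volume, n_sym_ops, copies_per_asu, mass_da):
    """Return (V_M in cubic angstroms per Da, solvent fraction).

    The solvent fraction uses the usual approximation 1 - 1.23 / V_M.
    """
    v_m = volume / (n_sym_ops * copies_per_asu * mass_da)
    return v_m, 1.0 - 1.23 / v_m


if __name__ == "__main__":
    volume = hexagonal_cell_volume(91.836, 84.217)  # cell edges from Table 1
    # Assumption: 3 asymmetric units per cell; 4 molecules of 13.4 kDa each.
    v_m, solvent = matthews(volume, 3, 4, 13400.0)
    print(f"V_M = {v_m:.2f} A^3/Da, solvent = {solvent * 100:.1f}%")
```

With these inputs the result lies close to the Matthews coefficient and solvent content reported in Table 1; the exact figures depend on the molecular mass used.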

The nsp9 G104E model was subjected to a single round of rigid body refinement
as implemented within Refmac5 as part of the CCP4 package (3). The 
model was then refined using alternating cycles of manual fitting in Coot and 
TURBO-FRODO (8, 17) and simulated annealing in CNS 1.2 (6). Composite 
simulated annealing omit maps were generated at regular intervals to guide the 
modeling efforts. After several rounds of rebuilding, group B factors were refined 
in conjunction with 50 cycles of energy minimization (CNS). In the final rounds 
of rebuilding and refinement, waters, ions, and cryoprotectant molecules were 
added to the model through the use of an Fo-Fc map and a 2Fo-Fc omit map 
contoured to 3σ and 1σ, respectively. Model quality was monitored by using 
CNS and Molprobity (11). The final structure containing four monomers of nsp9, 
134 waters, eight PO4, and 16 glycerols has been deposited into the Protein Data 
Bank (PDB) with the accession code 3EE7. The data and refinement statistics 
are listed in Table 1. 

Structure analysis. The individual monomers of nsp9 G104E were aligned 
(least-squares superposition) with one another and with monomers from 1UW7 
(57) and 1QZ8 (16). Dimer-only structures were also created from the G104E 
structure and superimposed on the analogous dimers from 1QZ8 (16) or 1UW7 
(57). In all cases, a Cα root mean square deviation (RMSD) was calculated by 
using either PyMol (12) or Coot. In the case of features such as the symmetry-related
helix-helix dimer formed by G104E, manual inspection (Coot) was made 
of the helices compared to 1QZ8 (16). Further examination of the helix-helix 
dimerization interface was accomplished through ProMotif, as found within the 
PDBSum webserver (26, 33-35). 
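
For reference, the Cα RMSD between two structures is the square root of the mean squared distance between equivalent Cα atoms. The sketch below computes this quantity for coordinates that have already been superimposed; it does not perform the least-squares superposition itself, which in this study was done in PyMol or Coot. The function name and the example coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (angstroms) between two equal-length lists of (x, y, z) C-alpha coordinates.

    The two coordinate sets are assumed to be already superimposed
    and listed in the same residue order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical three-residue example: the second set is shifted 1 angstrom along x.
    ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    moved = [(x + 1.0, y, z) for x, y, z in ref]
    print(ca_rmsd(ref, moved))
```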


RESULTS 


Expression and purification. Variants of nsp9 were gener- 
ated with the purpose of disrupting the helix-helix dimer found 
in the crystal structure of nsp9 as solved by Egloff et al. (16) to 
assess its biological relevance. Mutations at G100 and G104 
were created and verified through DNA sequencing. Wild-type 
nsp9, along with the G100E, G104E, and G104V mutants, was 
found to express at high levels as soluble protein upon 
induction with IPTG. 
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TABLE 1. X-ray data and refinement statistics 


General category and 


specific parameter value’ 

Crystal data 

PACS PTOUP isiseisccessivesethedcevesesercedeesieteevecnceriaveeste P31 

CA) tse thas ate acted nt os ch) Sh Bao enced 91.836 

b (A)... 91.836 

¢ (A) cesses 84.217 

O65 B52) evista ies cc adele dechssteestsrtesasteasivivense 90, 90, 120 

MOV AUS eencttccs nite ccaceancntdametiateeta 4 

VIM siesiceens 3.81 

To SOW cass sssccsseszisicssceusiessasicssucitesicaésensitorse 67.76 
Data collection 

Resolution range (A) vesccesssssssssetessscssssseseeseeesees 79.566-2.6 (2.69-2.60) 

Wavelength (A) 0. 

No. Of ObServatiONns .........ccceeceseeseeeeeeeeeeeeeeees 235,950 


Data completeness ...icsccscsisccersescesteseesecscessaaseiee 98.8 
d 
Rviciige, sasetiaissevsests copes lates lessitsclstionanasseusacsanseenese 5.3 (39.4) 
Refinement 
No. of reflections used watlet reel atiieen teen 24,148 


Resolution range (A) ......sssscscssssssssssesssessessons 79.566-2.6 
Free set size (%)......... . 


No. of atoms (protein)... 3,462 
No. of atoms (NONProteiN)...... eee 252 
Ryser sci ariecenstertestie ata cniuamuevtnnnamuasien 0.2138 
PAD Nate oars 0.2694 
Mean B vale (AX) ).:issccsicscssisseccccbessisievesciessiones 70.66 
RMSD 
Bond length (A) ..essseessssssssccccsssssssssnseeseesessees 0.006402 
Bond ‘angle: (°) ccssssisssseisicecssseccicisosissovsissserisiss 1.69421 


Ramachandran plot 
Favored (%) 
Allowed (%) 


“Numbers in parentheses are for the highest resolution shell. 
» Molecules per asymmetric unit. 
© Matthews coefficient. 


d Sani Una = Ciua)l)) 


‘merge = > Where I),,,; is the intensity of an individual 
DnntiTnes) : 


measurement of the reflection with Miller indices h, k, and /, and (I,,,) is the 
mean intensity of that reflection. 
eR Snta(||FOBS jaa] — |Fealcpll) 
il |Fobsj| 
served and calculated structure amplitudes. 
f Rye 18 Ryorx calculated with the free set data only, which were omitted from 
refinement. 


, where |Fobs,,,| and |Fcalc;,,| are the ob- 


Wild-type, G100E, G104E, and G104V forms of nsp9 were 
purified using IMAC (HisTrap HP; GE Healthcare), which 
yielded protein of 85% purity, aliquots of which were used for 
size estimation by SEC and SEC-MALS. Further purification
to obtain stocks for crystallization, EMSA, and FA analyses
was achieved by applying protein from the IMAC column to a
preparative gel filtration column (Superdex 75 HL 16/60; GE
Healthcare). The final protein was ~98% pure as
judged by silver-stained SDS-PAGE (data not shown).
Due to the presence of a noncleavable C-terminal His tag and
linker, the pure nsp9 monomer had an expected molecular
mass of 13.4 kDa. The structure of the G104E mutant, in
addition to the wild-type nsp9 structures found in the PDB,
suggested that the location of the C-terminal His tag does not
affect the stability of the helix-helix dimer.

TABLE 2. nsp9 sizes estimated by SEC-MALS and SEC

                             Size (kDa)
Protein           Monomer   Superdex 75   SEC-MALS
nsp9 wild type    13.4      26.4          25.5
nsp9 G100E        13.5      18.5          15
nsp9 G104E        13.5      17.8          16.3
nsp9 G104V        13.5      17.26         13.64

Size estimation using SEC and SEC-MALS. For analysis of 
the mutants’ oligomeric state, aliquots of peak fractions from 
IMAC were immediately applied to an analytical gel filtration 
column. The elution from the Superdex column was directed to 
a Wyatt Systems Mini-DAWN coupled with an Optilab DSP to 
allow for simultaneous collection of SEC data in conjunction 
with MALS data. 

While SEC is commonly used to estimate the molecular 
weight of a sample and thus the oligomerization state, the 
molecular weight estimates are highly dependent on the con- 
formation of a protein and any interactions between the pro- 
tein and the chromatographic matrix. In effect, a protein with
a nonglobular structure or one that interacts with the SEC
matrix through a property other than size (e.g., electrostatic
or hydrophobic interactions) will elute at a volume corresponding
to a protein with a much larger size. Size estimates made by MALS
are independent of nonideal SEC behavior and have been
shown to give a more accurate size determination for a variety
of proteins than is possible using SEC alone (18). The addition
of the SEC to the system has the added benefit of effectively
separating all species within a sample prior to their analysis by
MALS, leading to data that are more amenable to size assignment
and have an expected error of only ±5% (18).

Nsp9 wild type eluted as a single peak and light scattering 
estimated a molecular size of 25.5 kDa, corresponding to a 
particle size very close to the predicted value of 26.8 kDa for a 
dimer. In contrast, the SEC-MALS data on the nsp9 mutants 
gave molecular size estimates in good agreement with the 
protein being monomeric (Table 2). Molecular size estimates 
were also made using elution data from SEC alone by calcu- 
lating a calibration curve using well-characterized proteins, 
also shown in Table 2. Sutton et al. reported Kd values of 0.16
to 0.46 mM for the formation of the dimer (57). However, our
gel filtration experiments on wild-type nsp9 using an analytical
column indicate that dimerization occurs at protein concentrations
between 40 and 105 μM. Similar experiments showed
G100E, G104E, and G104V mutants were monomeric at or 
above concentrations where the wild-type protein was dimeric. 
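The oligomeric-state assignment from the SEC-MALS masses can be reproduced with simple arithmetic. The sketch below is illustrative: it divides each SEC-MALS mass from Table 2 by the calculated monomer mass and rounds to the nearest whole number of subunits. The helper name `oligomer_state` is hypothetical, and rounding to the nearest integer is an assumed decision rule rather than the authors' stated procedure.

```python
def oligomer_state(measured_kda, monomer_kda):
    """Return the nearest whole number of subunits and the raw mass ratio."""
    ratio = measured_kda / monomer_kda
    return max(1, round(ratio)), ratio


# (calculated monomer mass, SEC-MALS mass) in kDa, taken from Table 2.
table2 = {
    "wild type": (13.4, 25.5),
    "G100E": (13.5, 15.0),
    "G104E": (13.5, 16.3),
    "G104V": (13.5, 13.64),
}

states = {}
for name, (monomer, mals) in table2.items():
    n, ratio = oligomer_state(mals, monomer)
    states[name] = n
    print(f"{name}: {ratio:.2f} monomer equivalents -> n = {n}")

# Deviation of the wild-type mass from the predicted dimer mass (26.8 kDa).
wt_dimer_deviation = abs(25.5 - 2 * 13.4) / (2 * 13.4)
```

With these inputs the wild type is assigned as a dimer and each mutant as a monomer; the wild-type mass lies within the quoted ±5% of the predicted dimer mass, while the G100E and G104E masses sit somewhat above the monomer value but well below that of a dimer.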

The transition of nsp9 wild type from dimer to monomer 
could not be detected using ITC. The concentration range 
required to capture the dissociation from dimer to monomer 
was too low to generate any substantial change in enthalpy 
above the background. The G100E and G104E samples be- 
haved in ITC experiments as monomers and showed no signs 
of a monomer-dimer equilibrium. A weak dimerization con- 
stant of 0.871 mM was determined for G104V. 
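To see what a dimerization constant of this size implies, the fraction of protein present as dimer can be computed for a simple monomer-dimer equilibrium. This is a sketch under stated assumptions: the reported constants are treated as dissociation constants, Kd = [M]^2/[D], concentrations refer to total protein chains, and the function name `dimer_fraction` is hypothetical.

```python
import math


def dimer_fraction(total_mM, kd_mM):
    """Fraction of chains found in dimers for 2M <-> D with Kd = [M]^2 / [D].

    total_mM is the total chain concentration, [M] + 2[D].
    """
    if total_mM <= 0 or kd_mM <= 0:
        raise ValueError("concentrations must be positive")
    # Mass balance: 2[M]^2 / Kd + [M] - total = 0, solved for the positive root.
    monomer = kd_mM * (math.sqrt(1.0 + 8.0 * total_mM / kd_mM) - 1.0) / 4.0
    return 1.0 - monomer / total_mM


# G104V with the ITC-derived constant of 0.871 mM at a few total concentrations.
for conc in (0.04, 0.105, 0.871):
    print(f"{conc} mM total: {dimer_fraction(conc, 0.871):.2f} in dimer")
```

In this model half of the chains are dimeric when the total concentration equals Kd, so a protein with a Kd near 0.9 mM would be mostly monomeric at the micromolar concentrations used for gel filtration.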

Secondary structure analysis. Analysis of nsp9 samples
through the use of CD spectroscopy showed that the overall
secondary structure composition was maintained in all samples.
The secondary structure distribution in the wild-type, G100E,
G104E, and G104V proteins, as estimated from the CD data, is
shown in Table 3. Given that structure assignment
for all samples was nearly identical to percentages predicted
from the crystal structure, it was determined that the disruption
of dimerization seen in the SEC and SEC-MALS data was
not due to protein misfolding or the C-terminal His tag. In
particular, the amount of α-helix remained similar to the wild
type, indicating that the mutations had not affected the helical
structure of the dimerization interface. In addition, the crystal
structure of the G104E mutant (discussed below) showed that
secondary structure was not disrupted by the mutation.
Measurement of the melting temperature of the samples using CD
spectroscopy revealed that wild-type, G100E, and G104V proteins
have similar melting temperatures (58.6 to 62.5°C), while
G104E protein had a melting temperature that was significantly
higher (75°C). While it is unclear why G104E is more
stable, it does indicate that the individual mutations do not
destabilize nsp9.

TABLE 3. nsp9 structural features as measured by CD spectroscopy

                        % content
Sample            Sheet   Helix   Random   Mean Tm ± SD (°C)
nsp9 wild type    39      9       51       62.5 ± 1.6
nsp9 G100E        46      8       46       60.5 ± 3.8
nsp9 G104E        39      9       51       75 ± 0.7
nsp9 G104V        44      9       47       58.6 ± 2.3
Expected value    41      6       52       NA^a

^a NA, not applicable.

Gel shift assays. In order to determine whether the muta- 
tions and subsequent disruption of dimerization altered the 
ability of nsp9 to bind to RNA, we performed RNA EMSAs 
using standard techniques. To maximize the visualization of 
RNA-protein complexes, samples were UV cross-linked prior 
to separation of complexes from free RNA through electro- 
phoresis (24, 25). RNA EMSA experiments revealed that wild- 
type nsp9 exhibited clear binding with the RNA templates (Fig. 
2). Bovine serum albumin controls did not bind RNA (data not 
shown). Importantly, the G100E, G104E, and G104V mutants
were also able to bind ssRNA (Fig. 2). To date, the RNA- 
binding surface for nsp9 has not been determined; however, 
the ability of the mutants to bind suggests that the dimer 
interface is not critical for RNA binding. Changes in mobility 
were seen between the wild type and the mutants, but this 
could be a result of the change in protein charge or size of the 
complexes due to the monomeric or dimeric state of the pro- 
tein. Unlabeled ssDNA probe at a 10-fold excess was unable to 
compete with the ssRNA for binding and resulted in better 
resolution of the nsp9/RNA bands (Fig. 3). 

FIG. 2. Wild-type nsp9 and the mutants bind ssRNA. Lane 1,
nsp9 wild type/RNA (WT); lane 2, G100E/RNA; lane 3, G104E/RNA;
lane 4, G104V/RNA; lane 5, ssRNA. The positions of free probe and
protein/RNA shifts are indicated. Protein-RNA complexes were resolved
on a 4 to 20% TBE gel and observed after detection of the
biotinylated ssRNA.

FA. While EMSA allowed for the visualization of the interactions
between the RNA and nsp9, it did not allow for quantitative
measurements of the binding affinities and comparisons
to known data. Consequently, RNA-binding measurements were
made using a fluorescently labeled 20-mer of ssRNA on a
fluorometer equipped for measurement of FA data (Fig. 4).
Wild type bound to ssRNA with a Kd of 55 nM compared to
Kd values of 250, 620, and 320 nM for G104V, G104E, and G100E,
respectively.
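The size of the affinity loss follows directly from the ratio of the reported Kd values. The sketch below computes those ratios and, purely as an illustration, the fraction of RNA bound at a single protein concentration. The 1:1 binding model with protein in excess and the 500 nM protein concentration are assumptions made here; the fitting equation actually used is the one given in Materials and Methods and is not reproduced.

```python
# Kd values in nM as reported in the text for the 20-mer ssRNA.
KD_NM = {"wild type": 55.0, "G104V": 250.0, "G104E": 620.0, "G100E": 320.0}


def fold_change(kd_mutant, kd_reference):
    """Ratio of dissociation constants; values above 1 mean weaker binding."""
    return kd_mutant / kd_reference


def fraction_bound(protein_nM, kd_nM):
    """Fraction of RNA bound for an assumed 1:1 model with protein in excess."""
    return protein_nM / (kd_nM + protein_nM)


for name, kd in KD_NM.items():
    fc = fold_change(kd, KD_NM["wild type"])
    fb = fraction_bound(500.0, kd)
    print(f"{name}: {fc:.1f}-fold, {fb:.0%} bound at 500 nM protein")
```

The ratios place the mutants roughly 5- to 11-fold weaker than the wild type, with G104V the least affected.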

FIG. 3. ssDNA does not prevent ssRNA from binding to nsp9.
Wild-type nsp9 was loaded into lanes 1, 4, and 7; nsp9 G100E
into lanes 2, 5, and 8; and nsp9 G104E into lanes 3, 6, and 9. Lanes
1 to 3 contain only RNA and protein, lanes 4 to 6 contain equivalent
amounts of ssRNA and ssDNA, and lanes 7 to 9 contain a 10-fold
excess of ssDNA over ssRNA. Lane 10 is an RNA-only loading control.

FIG. 4. FA measurement of RNA binding by wild-type (WT),
G100E, G104E, or G104V nsp9. Each measurement is an
average of 10 individual measurements. The data were fitted to the
equation given in Materials and Methods to generate the curves
depicted in the graph and derive equilibrium binding Kd values.

Reverse genetics. The three mutations were designed and
incorporated into the wild-type SARS-D fragment of icSARS.
The three mutant SARS-D fragments were assembled, amplified,
sequence verified, and placed into the infectious clone of
SARS-CoV. The full-length cDNAs were then transcribed and
transfected into Vero cells and monitored for CPE. In all
cases, CPE was not evident. To determine
whether low-level RNA synthesis occurred, RT-PCR was con- 
ducted to assay for leader containing transcripts using primers 
designed to detect subgenomic S, but no bands were detected, 
suggesting either that no sgRNA synthesis was occurring or 
that transcript levels were below the levels of detectability. 
Mutants G100E and G104E were introduced into the infec-
tious clone three times, and in all three cases no recombinant 
mutant virus was detected or rescued. However, recombinant 
wild-type icSARS prepared in parallel did produce CPE and 
sgRNA synthesis. For mutant G104V, the experiment was re- 
peated four times. The first two times there was no evidence of 
viral replication by cytopathology or by RT-PCR. The third 
time, however, CPE was evident, and subgenomic S was de- 
tected by RT-PCR. In addition, the recombinant virus was 
successfully passaged in Vero cells and also produced wild 
type-sized plaques. Total viral RNA was harvested and the 
region flanking the mutation site was amplified by RT-PCR, 
electrophoresed on a 0.8% agarose gel, gel purified, and sequenced.
Interestingly, sequence analysis revealed that the
G104V mutation had reverted to the wild-type SARS-CoV residue at that
position. The mutated codon introduced was GTG, which encodes
Val, and this reverted to GGG, which encodes Gly.
Interestingly, the wild-type virus had a Gly at that position, but 
the codon was GGC. Since two changes were originally intro- 
duced and only one change was sufficient to revert it to a 
wild-type sequence at that site, the other mutation was still 
present, indicating that it was indeed a revertant and not an 
accidental contamination with the wild-type SARS-D fragment.
This suggests that the G104V mutant likely replicates very
inefficiently, which allowed the revertant to evolve. The G104V
mutant was generated and tested a fourth time, with results
identical to those of the first two trials, demonstrating the
rarity of this event.
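The codon argument above can be checked mechanically. The following sketch compares the wild-type (GGC), engineered (GTG), and revertant (GGG) codons position by position; the small codon table covers only these three codons, and the helper names are hypothetical.

```python
# Minimal codon table covering only the codons discussed in the text.
CODON_TO_AA = {"GGC": "Gly", "GGG": "Gly", "GTG": "Val"}
PURINES = {"A", "G"}


def differences(codon_a, codon_b):
    """List (position, base_a, base_b) for each mismatch between two codons."""
    return [(i + 1, x, y)
            for i, (x, y) in enumerate(zip(codon_a, codon_b)) if x != y]


def substitution_type(base_a, base_b):
    """Transition if both bases are purines or both pyrimidines, else transversion."""
    same_class = (base_a in PURINES) == (base_b in PURINES)
    return "transition" if same_class else "transversion"


wild_type, engineered, revertant = "GGC", "GTG", "GGG"
print(differences(wild_type, engineered))   # the two engineered changes
print(differences(engineered, revertant))   # the single reverting change
print(differences(wild_type, revertant))    # the engineered change that remains
```

The engineered codon differs from the wild type at two positions, a single T-to-G transversion at the second position restores Gly, and the third-position change remains as a marker distinguishing the revertant from the wild-type sequence.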

Crystallization and structure determination. Crystals of
nsp9 G104E grew in a hexagonal, rod-like morphology and
diffracted to a maximum of 2.6-Å resolution on beamline 11-1
of SSRL. No significant disruptions in the secondary structure
were seen, with nsp9 G104E exhibiting seven beta strands and
one alpha helix as previously observed in 1UW7 (57) and
1QZ8 (16). As determined by molecular replacement (Fig.
5A), four monomers of nsp9 G104E are present within the
asymmetric unit. In four regions residues could not be modeled
due to the lack of 2Fo-Fc omit density (chain A residues 36 and
37 and residues 60 and 61, chain C residues 37 and 38, and
chain D residue 37). This arrangement forms a helix-like bundle
of the individual monomers through two separate motifs,
the previously observed sheet-sheet dimer interface and a new
loop-sheet interface, with each monomer oriented roughly
perpendicular to its neighbors along the helix axis. The helix-helix
dimer interface is generated through crystallographic symmetry
(Fig. 5B); however, the interface is structurally distorted,
as discussed below. The helical arrangement of nsp9
G104E monomers forms a continuous strand parallel to the b
axis of each unit cell throughout the crystal by crystallographic
symmetry, which continually exposes a charged surface along
the edge of the helix (Fig. 5C). At the center of the asymmetric
unit is the previously observed sheet-sheet dimer interface,
which has a Cα RMSD of 1.92 Å compared to the sheet-sheet
dimer reported in 1UW7 (57). The G104E sheet-sheet dimer
introduces an additional six hydrogen bonds and 11 nonbonded
interactions compared to 1UW7 (57). A loop-sheet
interface is present at either end of the central sheet-sheet
dimer and results in the generation of the helical motif in the
asymmetric unit. This interface is formed through the association
of the loop between beta strands 6 and 7 with strand 6 of
an opposing monomer. Residues Asp78-C and Asp78-B make
hydrogen bonds with residues Asp47-D/Lys86-D and
Asp47-A/Lys86-A, respectively. The two loop-sheet interactions
present in the tetramer each bury roughly 800 Å² of surface
area. Moderate sequence conservation is found in these residues.

In the G104E mutant structure, the helix-helix dimer is
present as in both 1QZ8 (16) and 1UW7 (57) and is thought to
be the biologically relevant dimer, as seen in Fig. 5B. The
helix-helix interface is disturbed but not disrupted completely,
with the presence of Glu instead of Gly at the interface causing
the Cα atoms at position 104 to move 6.56 Å apart,
compared to 4.76 Å in 1QZ8 (16) and 5.4 Å in 1UW7 (57), as also
shown in a superposition in Fig. 5B. The distances between
other atoms in the helices are essentially maintained compared
to 1QZ8 (16) and 1UW7 (57), with an overall RMSD
over the helices of 0.66 Å. In addition, the helical crossing
angles are markedly different for G104E (−50.9°) compared to
1QZ8 (16) and 1UW7 (57) (−41.1° and −38.7°, respectively).
The helix-helix interface is held together by hydrogen bonds
that form between either the OE1 or OE2 atoms of the Glu
side chain and the main-chain nitrogen of Leu 9 of the opposing
strand (Fig. 5D). A number of phosphate ions were found
within the structure coordinated by surface charged residues
thought to be involved in nucleotide binding. Specific overlaps
upon superposition of 1QZ8 (16) with our structure occur at
sulfate 693 of 1QZ8 (16) with glycerols 3 and 8 and phosphates
1 and 2 of G104E. If the phosphates are taken as representative
binding areas for backbone phosphates of target nucleic
acid, then the nearby residues are potential sites to target for
mutagenesis examining RNA or DNA binding. The overall
symmetry motifs generated by crystallographic symmetry in
1QZ8 (16) and 1UW7 (57) are not replicated in our structure
due to the presence of the loop-sheet interaction and the 3₁
screw axis.

FIG. 5. Tetrameric arrangement of G104E crystals. (A) G104E structure exhibiting the sheet-sheet dimer (center of structure, yellow and blue
monomers) and the loop-sheet interface (flanking the sheet-sheet dimer on either side, red/yellow or blue/green monomers) present within the asymmetric
unit containing four monomers of nsp9. (B) Helix-helix dimer in the G104E structure, similar to the helix-helix dimer in 1QZ8. Alignment of the
two structures is depicted with 1QZ8 in red and G104E in green. Significant deviations in both crossing angles and buried surface area arise as
a result of the G104E mutation. (C) Electrostatic surface of the helix-like arrangement of G104E monomers along the b axis of a unit cell. Colors:
blue, negative surface; white, neutral surface; red, positive surface. (D) Interactions stabilizing the helix-helix dimer. Primary contacts are formed
between G104E and L9 of a symmetry-related monomer. Distances between atoms are in angstroms. The L9′ and G104E′ designations indicate
that the residues are from a symmetry-related monomer. Images were prepared by using PyMol.


DISCUSSION 


SARS-CoV, like other CoVs, produces a number of small
highly conserved RNA-binding proteins essential to viral replication.
We have used a combination of structure-guided mutagenesis
and reverse genetics to evaluate the role of nsp9 in
replication. In the present study, G100 and G104 of a GXXXG
dimerization motif were mutated to determine whether the
nsp9 homodimer could be disrupted at the conserved helix-helix
interface and whether the homodimer was relevant to
virus replication. nsp9 mutants were analyzed in vitro by SEC-MALS,
CD spectroscopy, EMSA, and FA. The crystal structure
of G104E nsp9 was solved, and disruption of the helix-helix
interface was observed. In addition, each nsp9 mutant
was introduced into the virus using reverse genetics, and the in
vivo effects were monitored. The biochemical data strongly
suggest that mutation of either glycine results in the production
of monomeric nsp9 in vitro and nonviable virus in vivo.

nsp9 mutations disrupt dimerization. An investigation of a 
nonredundant PDB set revealed that the GXXXG motif being 
examined in nsp9 was well represented among soluble proteins 
(31) but has not been extensively studied for its contributions 
to homo- and heterodimerization of soluble proteins contain- 
ing the motif. The motif was also commonly found in trans- 
membrane helices, where it was postulated to allow for close 
approach of adjacent helices for subsequent dimerization. In 
several examples, mutation of either Gly in the motif led to
decreased or complete loss of dimerization of transmembrane 
helices (45, 51, 56). The SEC-MALS and SEC analyses in 
conjunction with EMSA data showed that G100E, G104E, and 
G104V mutants were monomeric in solution, while wild-type 
nsp9 was dimeric, indicating that the GXXXG motif was im- 
portant for maintaining the nsp9 dimer. This also indicates that 
the helix-helix dimer and not the sheet-sheet dimer is the 
major dimer in solution.

RNA binding by nsp9. The results from the EMSA of wild-
type nsp9 and the G100E, G104E, and G104V mutants show
that each protein binds RNA, indicating the dimeric state of 
the protein is not critical for RNA binding. nsp9 apparently 
utilizes RNA as the preferred ligand. An unlabeled DNA 
oligonucleotide could not compete with labeled RNA during 
an EMSA, which confirms reports by Egloff et al., who sug- 
gested that nsp9 preferred RNA over DNA (16). 

In order to distinguish quantitatively between wild-type and 
mutant RNA binding affinities, FA measurements were used. 
FA measurements using an ssRNA 20-mer give Kd values of 55
nM for wild type and a range from 250 to 650 nM for the 
mutants, which is comparable to the values obtained by Egloff 
et al. in which a larger fragment of RNA (560 bases) was 
reported to have a binding constant of 400 nM, using an assay 
monitoring the change in tryptophan fluorescence (16). The 
equilibrium binding constants presented here were determined 
at a lower ionic strength (15 mM NaCl) and may be lower due 
to coulombic effects. The G100 and G104 mutants have Kd
values higher than that of the wild type ranging from 250 to 650 
nM. Disruption of the nsp9 dimer does not abrogate binding of 
RNA but does result in a 5- to 12-fold decrease in affinity. 

The relatively low affinity binding constants may reflect that, 
as part of the replication complex, nsp9 may not have a specific 
binding sequence but may act in conjunction with nsp7 and 
nsp8 as a processivity factor. Among the SARS-CoV nsp’s, 
nsp9 has already been reported to bind ssRNA and ssDNA with
low micromolar affinity (16, 57), and nsp10 was reported
to have high micromolar affinity for dsRNA (29), while
nsp8 and the nsp7-8 hexadecamer were also shown to bind 
double-stranded RNA and double-stranded DNA with mod- 
erate efficiency (67). The interaction of these RNA-binding 
proteins with relatively weak binding affinities suggests that the 
presence of a complete replication complex may be necessary 
for high-affinity binding. nsp9 has been shown in several cases 
to interact with these proteins through several different meth- 
ods (57). A specific target RNA sequence for nsp9 within the 
SARS-CoV genome has not been identified, although interac- 
tions with a stem-loop in the 3’ region of the genome were 
recently reported (69). 

Local conditions could also affect binding of nucleic acid to 
these proteins. CoVs use large membrane-associated replicase 
complexes found within virus-generated double-membrane vesicles
to achieve viral replication (53); thus, it can be assumed
that the local concentration of RNA around nsp9 would be 
markedly higher than nanomolar levels. Alternatively, multiple 
weak binding sites in nsp9 may contribute to the formation of 
the viral replication complex without altering the individual 
nsp9 proteins. In addition, the presence of the replication 
complex may serve to introduce protein-RNA interaction sites 
that are not available for the individual nsp’s. 

Dimerization is critical for viral function. Several experiments
suggest that the nsp9 dimer is required for viral growth.
In vitro, the SEC-MALS data show that the dimerization of
nsp9 was successfully disrupted with the G100E, G104E, and
G104V mutations. In vivo, the G100E and G104E mutations in
nsp9 were lethal to efficient virus replication in cell culture,
based on detection of subgenomic mRNAs. In contrast, the
G104V mutation displayed a single example of delayed growth
kinetics with reversion to an alternate codon for wild-type virus
production.

Analysis of the genomic RNA and subgenomic mRNA using 
real-time PCR suggested that, while subgenomic synthesis was 
not detected at any time, genomic RNA increased slightly at 
early time points before decreasing at later times during infec- 
tion. This increase was not significantly different from background
(data not shown), but this observation leaves open the
possibility that genomic plus-strand RNA synthesis occurred at
a reduced rate in the G104V mutant. van Marle et al. constructed
a mutant of the related equine arteritis virus, incorporating a single
point mutation in an nsp, that made genome-length molecules but not
mRNA (59). Low-level genomic rep-
lication similar to that observed in equine arteritis virus may 
have provided the opportunity for the reversion to occur in the 
G104V mutation. An additional possibility would be that an 
error in the T7 transcription step could have introduced the 
revertant nucleotide in the G104V mutant; however, the error 
rate for the T7 RNA polymerase is one error per 20,000 nu- 
cleotides and, when this error rate is factored in with a genome 
size of 29,751 nucleotides, the probability of an error occurring 
in that specific nucleotide would be ~1/500,000,000. 
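The order of magnitude of this estimate can be reproduced with a short calculation. The sketch below follows one reading of the text, which is an assumption: the per-nucleotide T7 error rate is multiplied by the chance that the error falls on one particular site among the genome's nucleotides. It does not account for which base is misincorporated.

```python
T7_ERRORS_PER_NT = 20_000   # one error per this many nucleotides (from the text)
GENOME_LENGTH = 29_751      # nucleotides (from the text)

# Assumed reading of the estimate: error rate times the chance of hitting
# one specific site in the genome.
one_in = T7_ERRORS_PER_NT * GENOME_LENGTH
p_site = 1 / one_in
print(f"about 1 in {one_in:,}")
```

The product is about 1 in 6 × 10^8, the same order of magnitude as the ~1/500,000,000 quoted above.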

Functionally, it would seem that during infection nsp9 must 
form a dimer to properly bind and orient RNA for subsequent 
use by the replicase machinery despite the ability of the mu- 
tants to bind RNA in vitro. It is unlikely that the nsp9 dimer 
mutations have an effect on processing of the ORF1a and
ORF1ab polyproteins. The MHV nsp9-10 fusion studied by
Deming et al. (13) was shown to be functionally competent for 
viral replication, indicating that nsp9-10 did not need to be 
processed for replication to occur. There is some evidence 
using light scattering that an nsp9-10 construct is dimeric in 
solution (Z. J. Miknis and L. W. Schultz, unpublished results). 
Further modeling indicates that the C termini of nsp9 can line 
up with the N termini of nsp10 and still maintain the dimer 
structures of both proteins. Therefore, the nsp9-10 fusion may 
be functionally identical to its wild-type processed counter- 
parts. 

Interestingly, the G104V mutation was still able to produce 
revertant viruses, despite SEC and SEC-MALS data indicating 
that it is monomeric. ITC data suggest that G104V can form a 
dimer, but ~8-fold more weakly than the wild type, as esti- 
mated by SEC. Dimerization has been shown in the case of the 
RNA-binding protein NS1 of influenza virus to be necessary 
for RNA binding (61), as well as for the She2P protein from 
yeast (46). The hepatitis A virus 3C protease has affinity for 
viral RNA that is increased from millimolar to micromolar 
levels upon dimerization of the enzyme (48). G104V may be 
able to form a weakly associated helix-helix dimer that is still 
able to function at a reduced level within the virus, allowing 
sufficient virus replication to evolve a revertant strain. 

While the nsp9 dimerization mutants are able to bind to 
RNA effectively, the mutations could also disrupt interactions 
that allow them to form hetero-oligomers with other SARS 
proteins. In rabies virus, the nucleoprotein shows no sequence 
specificity for RNA; however, when it is incorporated into the 
ribonucleoprotein complex that is ultimately responsible for 
RNA transcription, the specificity is gained for genomic RNA 
(39). A similar mechanism may exist in SARS-CoV, where nsp9 can
switch from nonspecific binding to specific binding upon formation
of the replicase complex.

Structural analysis. The crystal structure of G104E shows 
the clear formation of three interaction motifs defined by a 
helix-helix dimer, sheet-sheet dimer, and a loop-sheet inter- 
face. This is in contrast to SEC and SEC-MALS, where no 
G104E dimers or tetramers were seen in solution, even in 
experiments performed at protein concentrations used in the 
crystallizations. The concentration of nsp9 G104E in the crys- 
tal may be high enough to favor the dimer, or there may be enough
energy gained from formation of the crystal lattice and other 
crystal contacts to force dimerization. Hydrogen bonds form 
between the OE1 or OE2 atoms of the G104E side chain 
across the dimer interface with the main-chain nitrogen of Leu 
9. These hydrogen bonds serve to allow the formation of the 
dimer in the crystal, the equivalent of which are not accessible 
in the G104V and G100E mutants. The structure does show 
significant disruption of the dimer interface, including a wider 
separation between the two halves of the dimer and a steeper 
helix crossing angle. Together, these differences may be 
enough to disrupt the formation of a dimer in solution. 

The G104E mutant showed a higher Tm than did the wild
type, the G100E mutant, or the G104V mutant as measured by
CD spectroscopy. It is unclear why the G104E mutant displays
a Tm ~10°C higher than that for the other mutants and the
wild type. The G104E mutant was also the only sample that produced
crystals, which may be a reflection of its greater thermal stability
compared to the wild type and the G100E or G104V mutant.

A recent report has described an antiparallel helix-helix 
dimer of nsp9 from HCoV-229E that is held together by a 
disulfide bridge (49). For the experiments described here, re- 
ducing agents were maintained in wild-type and mutant nsp9 
samples in all experiments. However, wild-type nsp9 samples 
showed no evidence of disulfide bond formation and could be 
dissociated into monomers under oxidizing conditions in solu- 
tion, as monitored by high-resolution SEC. This may represent 
a significant difference between the function of nsp9 in SARS- 
CoV and HCoV-229E. 

As suggested by Sutton et al. (57), it is likely that the sheet- 
sheet dimer formed in the center of the tetramer is not 
biologically relevant. The new loop-sheet interface and the 
helix-helix dimer bury roughly equivalent surface areas (~800
Å²) and make approximately the same number of intermolec-
ular contacts. The new loop-sheet interface does have contacts 
from residues D78, D47, and K86 that are moderately con- 
served among various CoVs. The sheet-sheet dimer has a sig- 
nificantly smaller buried surface area but has more H-bonds 
and nonbonded contacts for stabilization. 

Interestingly, the formation of the sheet-sheet dimer flanked 
by the loop-sheet interface within the tetramer results in the 
formation through crystallographic symmetry of a continuous 
helical arrangement of nsp9 along the b axis of the crystal. This 
arrangement of tetramers exposes a near-continuous patch of 
positively charged surface wrapping around the overall helical 
structure, suggesting the potential for binding of long stretches 
of RNA. Although this tetramer was not observed in solution, 
it does suggest that nsp9 is capable of forming polymeric struc- 
tures that might act as scaffolds for binding genomic RNA.

Summary. Crystallography is an essential tool for the study
of protein structure and function. However, the biological im- 
plications of structures are often not clear. In the case of the 
SARS-CoV nsp9 RNA-binding protein, several different dimer 
interfaces have been proposed based on interactions seen in 
the crystal. We have shown that the conserved helix-helix 
dimer interface containing a GXXXG protein-protein interac- 
tion motif is biologically relevant to SARS-CoV replication. 
Disruption of this interface by site-directed mutagenesis of the 
glycine residues in the GXXXG motif resulted in monomeric 
forms of nsp9 in solution. Subsequent introduction of muta- 
tions into SARS-CoV by reverse genetics showed that the 
formation of the nsp9 dimer was necessary for viral viability. A 
single mutant G104V reverted to wild-type nsp9, indicating 
significant evolutionary and functional pressure on the dimer- 
ization interface. The RNA-binding affinity of monomeric nsp9 
mutants was reduced but not eliminated, indicating that the 
dimer retains a slight advantage over the monomer in RNA 
binding. The inability of the nsp9 monomers to function in vivo 
may not reflect an RNA-binding ability but rather the correct 
positioning of RNA in the replication complex requiring a 
properly dimerized nsp9. This view is supported by the crystal 
structure of G104E nsp9 in which the helix-helix dimer inter- 
face is significantly disrupted from wild-type nsp9. Future ex- 
periments combining crystallography and in vitro biophysical 
analysis of protein-protein and protein-RNA interactions will 
be necessary to unravel the complicated structure of the CoV 
replication complex. Collectively, our data add to a growing 
body of literature that implicates nsp9 as a key ingredient that 
intimately engages other proteins in the replicase complex to 
mediate efficient virus transcription and replication.